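# Packages the functions below appear to come from (assumed; the original script does not show its library() calls)
invisible(lapply(c("readxl", "zoo", "xts", "dplyr", "funModeling", "Hmisc", "tseries", "vars", "TSstudio"), library, character.only = TRUE))
setwd("/Users/gustavoacosta/Desktop/5 semestre/intro econometrics/datasets1")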
dir()
## [1] "~$updated_cocasales_data.xlsx" "updated_cocasales_data.xlsx"
cocacolasales_new <- read_excel("updated_cocasales_data.xlsx")
cocacolasales_new2 <- read_excel("updated_cocasales_data.xlsx", sheet = "tseries3 - GDL" )
How is time series analysis useful for forecasting an outcome?
In this second evidence we use time series analysis, a useful tool for forecasting. It is useful because it shows how the data change over time, which lets us identify the direction of change and observe trends in data collected from a specific unit of analysis over a period of time.
Our problem situation is based on Coca-Cola FEMSA, a Mexican multinational in the beverage industry. It is currently one of the biggest bottling companies in Latin America and operates in 10 Latin American countries and the Philippines. Its financial reports from 2015 to 2018 show that unit-box sales were highest in March and May 2018 and lowest in January-March 2017, and that sales were not very consistent during the last year of the reports.
Our main objective in this evidence is to analyze how seasonal phenomena affect unit-box sales and how sales respond to seasons such as summer and winter. We will build regression and time series models based on the behavior of certain components, and select a predictive model that helps us estimate sales while taking the different components of time series data into account.
Dependent Variable: - Unit sales of boxes (ccsales_unit_boxes)
Independent Variables: - Date (time)
The data set covers Coca-Cola FEMSA from 2015 to 2018 at monthly frequency. It contains our dependent variable, unit sales of boxes, which is the variable we want to forecast, as well as independent variables that we can analyze seasonally, such as the weather.
cocacolasales_new$date=as.yearmon(cocacolasales_new$date,format="%Y/%m")
Here is a quick overview of the data set: descriptive statistics such as the mean and Gini's mean difference (Gmd) of each variable, along with data types and missing values.
basic_eda <- function(data)
{
glimpse(data)
df_status(data)
freq(data)
profiling_num(data)
plot_num(data)
describe(data)
}
basic_eda(cocacolasales_new)
## Rows: 48
## Columns: 2
## $ date <yearmon> Jan 2015, Feb 2015, Mar 2015, Apr 2015, May 201…
## $ ccsales_unit_boxes <dbl> 5516689, 5387496, 5886747, 6389182, 6448275, 669794…
## variable q_zeros p_zeros q_na p_na q_inf p_inf type unique
## 1 date 0 0 0 0 0 0 yearmon 48
## 2 ccsales_unit_boxes 0 0 0 0 0 0 numeric 48
## Warning in freq(data): None of the input variables are factor nor character
## Warning: attributes are not identical across measure variables; they will be
## dropped
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
## data
##
## 2 Variables 48 Observations
## --------------------------------------------------------------------------------
## date
## n missing distinct
## 48 0 48
##
## lowest : Jan 2015 Feb 2015 Mar 2015 Apr 2015 May 2015
## highest: Aug 2018 Sep 2018 Oct 2018 Nov 2018 Dec 2018
## --------------------------------------------------------------------------------
## ccsales_unit_boxes
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 6473691 680321 5491459 5576844
## .25 .50 .75 .90 .95
## 6171767 6461357 6819782 7288957 7396022
##
## lowest : 5301755 5387496 5477874 5516689 5568552
## highest: 7330137 7345037 7423475 7457473 7963063
## --------------------------------------------------------------------------------
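The `Gmd` column in the `describe()` output above is Gini's mean difference: the average absolute difference between every pair of observations. The following is a minimal base-R sketch of that idea on a made-up vector, not on the sales data; the helper name `gini_md` is hypothetical and is not part of Hmisc.

```r
# Gini's mean difference: mean of |x_i - x_j| over all distinct pairs.
# `gini_md` is an illustrative helper, not a function from any package.
gini_md <- function(x) {
  x <- x[!is.na(x)]
  mean(as.vector(dist(x)))  # dist() returns each unordered pair once
}

toy <- c(1, 2, 4)           # pairwise gaps are 1, 3 and 2
toy_gmd <- gini_md(toy)
print(toy_gmd)
```

Because it averages pairwise gaps instead of squared deviations from the mean, this measure is less sensitive to a single extreme month than the variance is.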
Next, we plot the first sheet of our data set, which contains two variables: unit-box sales and the monthly date from 2015 to 2018.
Plot 1
This plot shows Coca-Cola's unit-box sales over time. The path changes over time, with sharp dips in several years that interrupt the otherwise steady behavior of unit-box sales across the four years of the series.
plot(cocacolasales_new$date,cocacolasales_new$ccsales_unit_boxes, type="l",col="blue", lwd=2, xlab ="Date",ylab ="Sales", main = "Coca Cola Femsa Sales Unit")
Alternative TS Plot 2
Here is an alternative plot of unit-box sales. Unlike the previous plot, it labels the dates by month, which makes it easier to identify the low points in sales, for example around February 2016, 2017, and 2018.
plot1xts<-xts(cocacolasales_new$ccsales_unit_boxes,order.by=cocacolasales_new$date)
plot(plot1xts)
This second set of visualizations is for the data set we will use for the VAR model. Here is a quick overview of its variables and some descriptive statistics for each column.
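Dependent Variable: - Unit sales of boxes (sales_unitboxes)
Independent Variables: - consumer_sentiment, CPI, inflation_rate, unemp_rate, gdp_percapita, itaee, itaee_growth, pop_density, job_density, pop_minwage, exchange_rate, max_temperature, holiday_month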
basic_eda <- function(data)
{
glimpse(data)
df_status(data)
freq(data)
profiling_num(data)
plot_num(data)
describe(data)
}
basic_eda(cocacolasales_new2)
## Rows: 48
## Columns: 15
## $ date <chr> "2015/01", "2015/02", "2015/03", "2015/04", "2015/0…
## $ sales_unitboxes <dbl> 5516689, 5387496, 5886747, 6389182, 6448275, 669794…
## $ consumer_sentiment <dbl> 38.06250, 37.49114, 38.50522, 37.84286, 38.03169, 3…
## $ CPI <dbl> 87.11010, 87.27538, 87.63072, 87.40384, 86.96737, 8…
## $ inflation_rate <dbl> -0.09, 0.19, 0.41, -0.26, -0.50, 0.17, 0.15, 0.21, …
## $ unemp_rate <dbl> 0.05230256, 0.05311320, 0.04608844, 0.05102038, 0.0…
## $ gdp_percapita <dbl> 11659.56, 11659.55, 11659.55, 11625.75, 11625.74, 1…
## $ itaee <dbl> 103.7654, 103.7654, 103.7654, 107.7518, 107.7518, 1…
## $ itaee_growth <dbl> 0.049716574, 0.049716574, 0.049716574, 0.031838981,…
## $ pop_density <dbl> 98.54185, 98.54186, 98.54187, 98.82843, 98.82844, 9…
## $ job_density <dbl> 18.26048, 18.46329, 18.64164, 18.67876, 18.67539, 1…
## $ pop_minwage <dbl> 9.657861, 9.657861, 9.657861, 9.594919, 9.594919, 9…
## $ exchange_rate <dbl> 14.69259, 14.92134, 15.22834, 15.22618, 15.26447, 1…
## $ max_temperature <dbl> 28, 31, 29, 32, 34, 32, 29, 29, 29, 29, 29, 26, 28,…
## $ holiday_month <dbl> 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, …
## variable q_zeros p_zeros q_na p_na q_inf p_inf type unique
## 1 date 0 0 0 0 0 0 character 48
## 2 sales_unitboxes 0 0 0 0 0 0 numeric 48
## 3 consumer_sentiment 0 0 0 0 0 0 numeric 48
## 4 CPI 0 0 0 0 0 0 numeric 48
## 5 inflation_rate 0 0 0 0 0 0 numeric 41
## 6 unemp_rate 0 0 0 0 0 0 numeric 48
## 7 gdp_percapita 0 0 0 0 0 0 numeric 48
## 8 itaee 0 0 0 0 0 0 numeric 16
## 9 itaee_growth 0 0 0 0 0 0 numeric 16
## 10 pop_density 0 0 0 0 0 0 numeric 48
## 11 job_density 0 0 0 0 0 0 numeric 45
## 12 pop_minwage 0 0 0 0 0 0 numeric 16
## 13 exchange_rate 0 0 0 0 0 0 numeric 48
## 14 max_temperature 0 0 0 0 0 0 numeric 12
## 15 holiday_month 36 75 0 0 0 0 numeric 2
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
## data
##
## 15 Variables 48 Observations
## --------------------------------------------------------------------------------
## date
## n missing distinct
## 48 0 48
##
## lowest : 2015/01 2015/02 2015/03 2015/04 2015/05
## highest: 2018/08 2018/09 2018/10 2018/11 2018/12
## --------------------------------------------------------------------------------
## sales_unitboxes
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 6473691 680321 5491459 5576844
## .25 .50 .75 .90 .95
## 6171767 6461357 6819782 7288957 7396022
##
## lowest : 5301755 5387496 5477874 5516689 5568552
## highest: 7330137 7345037 7423475 7457473 7963063
## --------------------------------------------------------------------------------
## consumer_sentiment
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 37.15 3.041 33.93 34.63
## .25 .50 .75 .90 .95
## 35.64 36.76 38.14 41.81 42.84
##
## lowest : 28.66787 31.51561 33.79513 34.18934 34.33673
## highest: 42.13270 42.53301 43.00569 43.34109 44.86544
## --------------------------------------------------------------------------------
## CPI
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 93.4 5.811 87.16 87.37
## .25 .50 .75 .90 .95
## 89.18 92.82 98.40 100.08 101.26
##
## lowest : 86.96737 87.11010 87.11311 87.24082 87.27538
## highest: 100.49200 100.91700 101.44000 102.30300 103.02000
## --------------------------------------------------------------------------------
## inflation_rate
## n missing distinct Info Mean Gmd .05 .10
## 48 0 41 0.999 0.3485 0.4164 -0.3330 -0.1900
## .25 .50 .75 .90 .95
## 0.1650 0.3850 0.5575 0.6510 0.8255
##
## lowest : -0.50 -0.45 -0.34 -0.32 -0.26, highest: 0.70 0.78 0.85 1.03 1.70
## --------------------------------------------------------------------------------
## unemp_rate
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 0.04442 0.006762 0.03648 0.03747
## .25 .50 .75 .90 .95
## 0.04010 0.04369 0.04897 0.05373 0.05413
##
## lowest : 0.03466221 0.03587220 0.03641392 0.03659655 0.03677829
## highest: 0.05383592 0.05394831 0.05423057 0.05473379 0.05517447
## --------------------------------------------------------------------------------
## gdp_percapita
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 11979 287.3 11570 11592
## .25 .50 .75 .90 .95
## 11830 12014 12162 12297 12318
##
## lowest : 11558.59 11558.59 11558.59 11591.89 11591.89
## highest: 12296.98 12296.98 12329.04 12329.04 12329.05
##
## Value 11558 11592 11626 11660 11886 11920 11954 11988 12040 12072 12106
## Frequency 3 3 3 3 3 3 3 3 3 3 3
## Proportion 0.062 0.062 0.062 0.062 0.062 0.062 0.062 0.062 0.062 0.062 0.062
##
## Value 12138 12232 12264 12296 12330
## Frequency 3 3 3 3 3
## Proportion 0.062 0.062 0.062 0.062 0.062
##
## For the frequency table, variable is rounded to the nearest 2
## --------------------------------------------------------------------------------
## itaee
## n missing distinct Info Mean Gmd .05 .10
## 48 0 16 0.997 113.9 5.423 105.2 107.8
## .25 .50 .75 .90 .95
## 111.5 113.5 117.1 119.8 121.5
##
## lowest : 103.7654 107.7518 108.7077 110.5957 111.7800
## highest: 117.0615 117.3254 118.9366 119.7875 122.4821
##
## 103.7653537 (3, 0.062), 107.7518353 (3, 0.062), 108.7076578 (3, 0.062),
## 110.5956589 (3, 0.062), 111.779976 (3, 0.062), 111.7936456 (3, 0.062),
## 112.6669197 (3, 0.062), 113.2336441 (3, 0.062), 113.7050866 (3, 0.062),
## 115.672343 (3, 0.062), 116.373846 (3, 0.062), 117.0614999 (3, 0.062),
## 117.325411 (3, 0.062), 118.9365563 (3, 0.062), 119.7875356 (3, 0.062),
## 122.4821017 (3, 0.062)
## --------------------------------------------------------------------------------
## itaee_growth
## n missing distinct Info Mean Gmd .05 .10
## 48 0 16 0.997 0.03174 0.0167 0.006355 0.007811
## .25 .50 .75 .90 .95
## 0.022376 0.029977 0.043038 0.049717 0.054149
##
## lowest : 0.005570848 0.007811483 0.021536875 0.022021360 0.022494545
## highest: 0.041634475 0.047249285 0.047629618 0.049716574 0.056536274
##
## 0.005570848 (3, 0.062), 0.007811483 (3, 0.062), 0.021536875 (3, 0.062),
## 0.02202136 (3, 0.062), 0.022494545 (3, 0.062), 0.02328721 (3, 0.062),
## 0.023470888 (3, 0.062), 0.028115278 (3, 0.062), 0.031838981 (3, 0.062),
## 0.037510361 (3, 0.062), 0.041347463 (3, 0.062), 0.041634475 (3, 0.062),
## 0.047249285 (3, 0.062), 0.047629618 (3, 0.062), 0.049716574 (3, 0.062),
## 0.056536274 (3, 0.062)
## --------------------------------------------------------------------------------
## pop_density
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 100.6 1.502 98.64 98.83
## .25 .50 .75 .90 .95
## 99.61 100.67 101.69 102.43 102.60
##
## lowest : 98.54185 98.54186 98.54187 98.82843 98.82844
## highest: 102.42910 102.42912 102.69447 102.69449 102.69450
## --------------------------------------------------------------------------------
## job_density
## n missing distinct Info Mean Gmd .05 .10
## 48 0 45 1 20.38 1.455 18.64 18.68
## .25 .50 .75 .90 .95
## 19.28 20.39 21.60 21.93 22.10
##
## lowest : 18.26048 18.46329 18.64164 18.64668 18.67539
## highest: 21.97487 22.03936 22.13799 22.24837 22.36215
## --------------------------------------------------------------------------------
## pop_minwage
## n missing distinct Info Mean Gmd .05 .10
## 48 0 16 0.997 11.12 1.114 9.467 9.595
## .25 .50 .75 .90 .95
## 10.794 11.139 11.413 12.722 12.920
##
## lowest : 9.398393 9.594919 9.657861 10.675655 10.833710
## highest: 11.327926 11.669528 12.297004 12.721926 13.026305
##
## 9.398392752 (3, 0.062), 9.594918702 (3, 0.062), 9.657860913 (3, 0.062),
## 10.67565544 (3, 0.062), 10.83370977 (3, 0.062), 10.88170258 (3, 0.062),
## 10.94476958 (3, 0.062), 11.04232751 (3, 0.062), 11.23630782 (3, 0.062),
## 11.24085004 (3, 0.062), 11.30092217 (3, 0.062), 11.32792593 (3, 0.062),
## 11.66952843 (3, 0.062), 12.29700388 (3, 0.062), 12.7219262 (3, 0.062),
## 13.02630495 (3, 0.062)
## --------------------------------------------------------------------------------
## exchange_rate
## n missing distinct Info Mean Gmd .05 .10
## 48 0 48 1 18.18 1.797 15.23 15.42
## .25 .50 .75 .90 .95
## 17.38 18.62 19.06 20.16 20.30
##
## lowest : 14.69259 14.92134 15.22618 15.22834 15.26447
## highest: 20.26117 20.29054 20.30320 20.52058 21.38527
## --------------------------------------------------------------------------------
## max_temperature
## n missing distinct Info Mean Gmd .05 .10
## 48 0 12 0.974 30.5 2.961 27.00 28.00
## .25 .50 .75 .90 .95
## 29.00 30.00 32.25 34.30 35.00
##
## lowest : 26 27 28 29 30, highest: 33 34 35 36 37
##
## Value 26 27 28 29 30 31 32 33 34 35 36
## Frequency 2 2 6 13 4 5 4 6 1 3 1
## Proportion 0.042 0.042 0.125 0.271 0.083 0.104 0.083 0.125 0.021 0.062 0.021
##
## Value 37
## Frequency 1
## Proportion 0.021
## --------------------------------------------------------------------------------
## holiday_month
## n missing distinct Info Sum Mean Gmd
## 48 0 2 0.563 12 0.25 0.383
##
## --------------------------------------------------------------------------------
summary(cocacolasales_new2)
## date sales_unitboxes consumer_sentiment CPI
## Length:48 Min. :5301755 Min. :28.67 Min. : 86.97
## Class :character 1st Qu.:6171767 1st Qu.:35.64 1st Qu.: 89.18
## Mode :character Median :6461357 Median :36.76 Median : 92.82
## Mean :6473691 Mean :37.15 Mean : 93.40
## 3rd Qu.:6819782 3rd Qu.:38.14 3rd Qu.: 98.40
## Max. :7963063 Max. :44.87 Max. :103.02
## inflation_rate unemp_rate gdp_percapita itaee
## Min. :-0.5000 Min. :0.03466 Min. :11559 Min. :103.8
## 1st Qu.: 0.1650 1st Qu.:0.04010 1st Qu.:11830 1st Qu.:111.5
## Median : 0.3850 Median :0.04369 Median :12014 Median :113.5
## Mean : 0.3485 Mean :0.04442 Mean :11979 Mean :113.9
## 3rd Qu.: 0.5575 3rd Qu.:0.04897 3rd Qu.:12162 3rd Qu.:117.1
## Max. : 1.7000 Max. :0.05517 Max. :12329 Max. :122.5
## itaee_growth pop_density job_density pop_minwage
## Min. :0.005571 Min. : 98.54 Min. :18.26 Min. : 9.398
## 1st Qu.:0.022376 1st Qu.: 99.61 1st Qu.:19.28 1st Qu.:10.794
## Median :0.029977 Median :100.67 Median :20.39 Median :11.139
## Mean :0.031736 Mean :100.65 Mean :20.38 Mean :11.116
## 3rd Qu.:0.043038 3rd Qu.:101.69 3rd Qu.:21.60 3rd Qu.:11.413
## Max. :0.056536 Max. :102.69 Max. :22.36 Max. :13.026
## exchange_rate max_temperature holiday_month
## Min. :14.69 Min. :26.00 Min. :0.00
## 1st Qu.:17.38 1st Qu.:29.00 1st Qu.:0.00
## Median :18.62 Median :30.00 Median :0.00
## Mean :18.18 Mean :30.50 Mean :0.25
## 3rd Qu.:19.06 3rd Qu.:32.25 3rd Qu.:0.25
## Max. :21.39 Max. :37.00 Max. :1.00
The Augmented Dickey-Fuller (ADF) test checks the stationarity properties of the time series data. Statistical packages then estimate the ARIMA, SARIMA, and ARIMAX models based on those properties.
Stationary Test:
The p-value is smaller than 0.05, so we reject the null hypothesis of a unit root and conclude that the series is stationary. This means that the statistical properties of the process generating the series do not change over time. It does not mean the data do not change; it means the way they change does not itself change over time, and the mean stays constant over the period.
adf.test(cocacolasales_new$ccsales_unit_boxes)
## Warning in adf.test(cocacolasales_new$ccsales_unit_boxes): p-value smaller than
## printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: cocacolasales_new$ccsales_unit_boxes
## Dickey-Fuller = -4.4282, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
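To make the logic of the test above more concrete, here is a simplified sketch of the regression behind a Dickey-Fuller test, written in base R on simulated data. It is not what `adf.test()` runs internally: the helper `df_tstat` is hypothetical and leaves out the lagged differences and the trend term that the augmented test includes.

```r
# Simplified Dickey-Fuller regression: diff(y)_t = a + g * y_{t-1} + e_t.
# A strongly negative t-statistic on g is evidence against a unit root.
# `df_tstat` is an illustrative helper, not a function from tseries.
df_tstat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  fit  <- lm(dy ~ ylag)
  summary(fit)$coefficients["ylag", "t value"]
}

set.seed(123)
n <- 200
stationary  <- as.numeric(arima.sim(list(ar = 0.5), n = n))  # mean-reverting AR(1)
random_walk <- cumsum(rnorm(n))                               # has a unit root

stat_stationary  <- df_tstat(stationary)
stat_random_walk <- df_tstat(random_walk)
print(c(stationary = stat_stationary, random_walk = stat_random_walk))
```

The statistic has to be compared with Dickey-Fuller critical values and not with the usual t distribution, which is why a packaged test is preferable for real conclusions.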
ACF plots
In our plot we see significant autocorrelation around lags 1, 12, and 16. This shows how the time series is correlated with its own past values.
acf(cocacolasales_new$ccsales_unit_boxes,main="Significant Autocorrelations")
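As a rough illustration of what the ACF plot measures, the sketch below computes the sample autocorrelation from its definition and compares it with `acf()`. It uses a simulated series as a stand-in for the sales data, and `manual_acf` is a hypothetical helper name.

```r
# Sample autocorrelation at lag k, computed from its definition.
# `manual_acf` is an illustrative helper.
manual_acf <- function(x, k) {
  n  <- length(x)
  xc <- x - mean(x)
  sum(xc[1:(n - k)] * xc[(1 + k):n]) / sum(xc^2)
}

set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.6), n = 48))  # simulated stand-in series

by_hand  <- sapply(1:3, function(k) manual_acf(x, k))
built_in <- acf(x, lag.max = 3, plot = FALSE)$acf[2:4]  # element 1 is lag 0
bound    <- qnorm(0.975) / sqrt(length(x))              # approximate 95% band

print(rbind(by_hand, built_in))
print(bound)
```

With 48 monthly observations the band sits at roughly plus or minus 0.28, so only bars clearly beyond that height should be read as significant.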
Decompose a time series
Here we decompose the time series. We can note the following:
The first panel is “observed”, which is very similar to the first visualization in this R script and shows a pattern and a slight trend.
The first component is the trend, which shows a positive, roughly linear behavior: sales tend to increase over time.
The second component is the seasonality, the pattern that repeats over time. Here sales increase for a certain period and then drop to a low point.
The third component is random, the variability that the trend and seasonality cannot explain. The fluctuations are irregular over time, with some sharp dips around 2017.
arcaipcts<-ts(cocacolasales_new$ccsales_unit_boxes,frequency=12,start=c(2015,1))
arcapcdec<-decompose(arcaipcts)
plot(arcapcdec)
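The decomposition above is additive by default, so the three components add back up to the observed series. The sketch below checks this on a simulated monthly series (made-up numbers, not the sales data) using the same `decompose()` call.

```r
set.seed(7)
months <- 1:48
# Simulated monthly series: linear trend + yearly cycle + noise
toy <- ts(100 + 0.5 * months + 10 * sin(2 * pi * months / 12) + rnorm(48),
          frequency = 12, start = c(2015, 1))

dec <- decompose(toy)   # additive model by default

rebuilt <- dec$trend + dec$seasonal + dec$random
ok      <- !is.na(rebuilt)          # the moving-average trend is NA at both ends
max_gap <- max(abs(rebuilt[ok] - toy[ok]))

print(sum(ok))
print(max_gap)
print(round(sum(dec$figure), 10))   # seasonal effects are centered on zero
```

Because the trend is a centered moving average, six months at each end of a 48-month series have no trend or random value, which is worth keeping in mind when reading the edges of the plot.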
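The autoregressive component (ar1) is statistically significant, but the moving average component (ma1) is not (p-value = 0.972). The Hessian warning and the missing standard error for the intercept also suggest reading these estimates with caution.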
setwd("/Users/gustavoacosta/Desktop/5 semestre/intro econometrics/datasets1")
coca1 <- read_excel("updated_cocasales_data.xlsx")
summary(ARMA.mydata<-arma(coca1$ccsales_unit_boxes, order=c(1,1)))
## Warning in arma(coca1$ccsales_unit_boxes, order = c(1, 1)): Hessian negative-
## semidefinite
## Warning in sqrt(diag(object$vcov)): NaNs produced
## Warning in sqrt(diag(object$vcov)): NaNs produced
##
## Call:
## arma(x = coca1$ccsales_unit_boxes, order = c(1, 1))
##
## Model:
## ARMA(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1146267 -335947 44609 276851 1124630
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## ar1 5.271e-01 1.059e-02 49.770 <2e-16 ***
## ma1 4.901e-03 1.422e-01 0.034 0.972
## intercept 3.083e+06 NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Fit:
## sigma^2 estimated as 2.431e+11, Conditional Sum-of-Squares = 1.118125e+13, AIC = 1400.62
plot(ARMA.mydata)
ARMA.residuals<-(ARMA.mydata$residuals)
ARMA.residuals<-na.omit(ARMA.residuals)
acf(ARMA.residuals,main="ACF - ARMA (1,1)")
Box.test(ARMA.residuals,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: ARMA.residuals
## X-squared = 1.2684e-06, df = 1, p-value = 0.9991
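In the output above, the p-value of 0.9991 means the test finds no evidence of autocorrelation in the ARMA(1,1) residuals at lag 1. As a sketch of what the statistic is, the code below computes it by hand on simulated white noise and compares it with `Box.test()`; `ljung_box_q` is a hypothetical helper and the data are not the model residuals.

```r
# Ljung-Box statistic: Q = n (n + 2) * sum_{k=1..h} r_k^2 / (n - k)
# `ljung_box_q` is an illustrative helper.
ljung_box_q <- function(x, h) {
  n <- length(x)
  r <- acf(x, lag.max = h, plot = FALSE)$acf[-1]  # drop lag 0
  n * (n + 2) * sum(r^2 / (n - seq_len(h)))
}

set.seed(99)
resid_like <- rnorm(47)   # stand-in for model residuals (white noise)

q_manual <- ljung_box_q(resid_like, h = 1)
bt       <- Box.test(resid_like, lag = 1, type = "Ljung-Box")
p_manual <- pchisq(q_manual, df = 1, lower.tail = FALSE)

print(c(q_manual = q_manual, q_builtin = unname(bt$statistic)))
print(c(p_manual = p_manual, p_builtin = bt$p.value))
```

Testing only lag 1 is a narrow check; for monthly data it may be worth repeating the test with a larger `lag` so that seasonal lags are covered too.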
setwd("/Users/gustavoacosta/Desktop/5 semestre/intro econometrics/datasets1")
coca2 <- read_excel("updated_cocasales_data.xlsx")
ARIMA.mydatar<-arima(log(coca2$ccsales_unit_boxes), order=c(1,1,1))
print(ARIMA.mydatar)
##
## Call:
## arima(x = log(coca2$ccsales_unit_boxes), order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## 0.5737 -0.9791
## s.e. 0.1396 0.1263
##
## sigma^2 estimated as 0.006255: log likelihood = 51.65, aic = -97.31
acf(ARIMA.mydatar$residuals,main="ACF - ARIMA (1,1,1)")
Box.test(ARIMA.mydatar$residuals,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: ARIMA.mydatar$residuals
## X-squared = 0.026233, df = 1, p-value = 0.8713
adf.test(ARIMA.mydatar$residual)
## Warning in adf.test(ARIMA.mydatar$residual): p-value smaller than printed p-
## value
##
## Augmented Dickey-Fuller Test
##
## data: ARIMA.mydatar$residual
## Dickey-Fuller = -4.4059, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
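The ARIMA(1,1,1) model above is fitted to the logarithm of sales, so any forecast it produces is on the log scale and has to be transformed back with `exp()`. The sketch below shows one way this could look, using a simulated series and made-up parameters; `method = "ML"` is a choice made here for the example and is not taken from the original script.

```r
set.seed(2024)
# Simulated log-sales following an ARIMA(1,1,1); all numbers are made up
log_sales <- 15.6 + arima.sim(list(order = c(1, 1, 1), ar = 0.6, ma = 0.3),
                              n = 120, sd = 0.02)

fit <- arima(log_sales, order = c(1, 1, 1), method = "ML")
fc  <- predict(fit, n.ahead = 6)

# Back-transform the log-scale forecast and an approximate 95% interval
point <- exp(fc$pred)
lower <- exp(fc$pred - 1.96 * fc$se)
upper <- exp(fc$pred + 1.96 * fc$se)
print(round(cbind(lower, point, upper)))
```

Exponentiating the log forecast gives something closer to a median than a mean forecast, so the point values may sit slightly below the expected sales level.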
setwd("/Users/gustavoacosta/Desktop/5 semestre/intro econometrics/datasets1")
coca3 <- read_excel("updated_cocasales_data.xlsx", sheet = "tseries3 - GDL" )
coca3$date=as.Date(as.yearmon(coca3$date,format="%Y/%m"))
consumer_sentiment<-ts(coca3$consumer_sentiment,start=c(2015,1),end=c(2018,12),frequency=12)
CPI<-ts(coca3$CPI,start=c(2015,1),end=c(2018,12),frequency=12)
inflation_rate<-ts(coca3$inflation_rate,start=c(2015,1),end=c(2018,12),frequency=12)
unemp_rate<-ts(coca3$unemp_rate,start=c(2015,1),end=c(2018,12),frequency=12)
gdp_percapita<-ts(coca3$gdp_percapita,start=c(2015,1),end=c(2018,12),frequency=12)
itaee<-ts(coca3$itaee,start=c(2015,1),end=c(2018,12),frequency=12)
itaee_growth<-ts(coca3$itaee_growth,start=c(2015,1),end=c(2018,12),frequency=12)
pop_density<-ts(coca3$pop_density,start=c(2015,1),end=c(2018,12),frequency=12)
job_density<-ts(coca3$job_density,start=c(2015,1),end=c(2018,12),frequency=12)
pop_minwage<-ts(coca3$pop_minwage,start=c(2015,1),end=c(2018,12),frequency=12)
exchange_rate<-ts(coca3$exchange_rate,start=c(2015,1),end=c(2018,12),frequency=12)
max_temperature<-ts(coca3$max_temperature,start=c(2015,1),end=c(2018,12),frequency=12)
holiday_month<-ts(coca3$holiday_month,start=c(2015,1),end=c(2018,12),frequency=12)
sales_unitboxes<-ts(coca3$sales_unitboxes,start=c(2015,1),end=c(2018,12),frequency=12)
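The fourteen `ts()` calls above all share the same start date and frequency. As a possible shortcut, the sketch below builds one multivariate monthly series from every numeric column at once. It runs on a small hypothetical data frame (`toy_df`) with the same shape as the sheet, not on the real file.

```r
# Hypothetical stand-in for the sheet: 48 monthly rows, a few numeric columns
set.seed(5)
toy_df <- data.frame(
  date            = format(seq(as.Date("2015-01-01"), by = "month", length.out = 48), "%Y/%m"),
  sales_unitboxes = rnorm(48, 6.5e6, 5e5),
  max_temperature = round(rnorm(48, 30, 3)),
  exchange_rate   = rnorm(48, 18, 2)
)

# One call turns every numeric column into a monthly series
num_cols <- names(toy_df)[sapply(toy_df, is.numeric)]
all_ts   <- ts(toy_df[num_cols], start = c(2015, 1), frequency = 12)

print(colnames(all_ts))
print(end(all_ts))
sales_ts <- all_ts[, "sales_unitboxes"]   # pull out a single series when needed
```

Keeping the series in one object also makes it harder for a single column to end up with a different start date or frequency by mistake.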
Here are plots of each variable in the data set over time.
par(mfrow=c(3,3))
plot(coca3$date,coca3$consumer_sentiment,type="l",col="blue",lwd=2,xlab="Date",ylab="consumer_sentiment",main="consumer_sentiment")
plot(coca3$date,coca3$CPI,type="l",col="blue",lwd=2,xlab="Date",ylab="CPI",main="CPI Rate")
plot(coca3$date,coca3$inflation_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="Inflation",main="Inflation Rate")
plot(coca3$date,coca3$unemp_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="unemp_rate",main="unemp_rate")
plot(coca3$date,coca3$gdp_percapita,type="l",col="blue",lwd=2,xlab="Date",ylab="gdp_percapita",main="gdp_percapita")
plot(coca3$date,coca3$itaee,type="l",col="blue",lwd=2,xlab="Date",ylab="itaee",main="itaee")
plot(coca3$date,coca3$itaee_growth,type="l",col="blue",lwd=2,xlab="Date",ylab="itaee_growth",main="itaee_growth")
plot(coca3$date,coca3$pop_density,type="l",col="blue",lwd=2,xlab="Date",ylab="pop_density",main="pop_density")
plot(coca3$date,coca3$job_density,type="l",col="blue",lwd=2,xlab="Date",ylab="job_density",main="job_density")
plot(coca3$date,coca3$pop_minwage,type="l",col="blue",lwd=2,xlab="Date",ylab="pop_minwage",main="pop_minwage")
plot(coca3$date,coca3$exchange_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="exchange_rate",main="exchange_rate")
plot(coca3$date,coca3$max_temperature,type="l",col="blue",lwd=2,xlab="Date",ylab="max_temperature",main="max_temperature")
plot(coca3$date,coca3$holiday_month,type="l",col="blue",lwd=2,xlab="Date",ylab="holiday_month",main="holiday_month")
plot(coca3$date,coca3$sales_unitboxes,type="l",col="blue",lwd=2,xlab="Date",ylab="sales_unitboxes",main="sales_unitboxes")
Here is another format for the time series plots, showing mostly the same information for the same variables.
ts_plot(consumer_sentiment)
ts_plot(CPI)
ts_plot(inflation_rate)
ts_plot(unemp_rate)
ts_plot(gdp_percapita)
ts_plot(itaee)
ts_plot(itaee_growth)
ts_plot(pop_density)
ts_plot(job_density)
ts_plot(pop_minwage)
ts_plot(exchange_rate)
ts_plot(max_temperature)
ts_plot(holiday_month)
ts_plot(sales_unitboxes)
adf.test(coca3$consumer_sentiment)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$consumer_sentiment
## Dickey-Fuller = -0.70142, Lag order = 3, p-value = 0.9638
## alternative hypothesis: stationary
adf.test(coca3$CPI) # non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$CPI
## Dickey-Fuller = -2.4733, Lag order = 3, p-value = 0.3851
## alternative hypothesis: stationary
adf.test(coca3$inflation_rate)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$inflation_rate
## Dickey-Fuller = -3.2628, Lag order = 3, p-value = 0.08835
## alternative hypothesis: stationary
adf.test(coca3$unemp_rate) # non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$unemp_rate
## Dickey-Fuller = -2.2564, Lag order = 3, p-value = 0.4717
## alternative hypothesis: stationary
adf.test(coca3$gdp_percapita)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$gdp_percapita
## Dickey-Fuller = -2.881, Lag order = 3, p-value = 0.2223
## alternative hypothesis: stationary
adf.test(coca3$itaee)# stationary (p-value < 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$itaee
## Dickey-Fuller = -3.5209, Lag order = 3, p-value = 0.04927
## alternative hypothesis: stationary
adf.test(coca3$itaee_growth) # non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$itaee_growth
## Dickey-Fuller = -2.8144, Lag order = 3, p-value = 0.2489
## alternative hypothesis: stationary
adf.test(coca3$pop_density)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$pop_density
## Dickey-Fuller = -0.56892, Lag order = 3, p-value = 0.9751
## alternative hypothesis: stationary
adf.test(coca3$job_density)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$job_density
## Dickey-Fuller = -1.6524, Lag order = 3, p-value = 0.713
## alternative hypothesis: stationary
adf.test(coca3$pop_minwage) # non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$pop_minwage
## Dickey-Fuller = -3.117, Lag order = 3, p-value = 0.128
## alternative hypothesis: stationary
adf.test(coca3$exchange_rate)# non-stationary (p-value > 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$exchange_rate
## Dickey-Fuller = -1.9944, Lag order = 3, p-value = 0.5764
## alternative hypothesis: stationary
adf.test(coca3$max_temperature) # stationary (p-value < 0.05)
##
## Augmented Dickey-Fuller Test
##
## data: coca3$max_temperature
## Dickey-Fuller = -3.5429, Lag order = 3, p-value = 0.04747
## alternative hypothesis: stationary
adf.test(coca3$holiday_month) # stationary (p-value < 0.05)
## Warning in adf.test(coca3$holiday_month): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: coca3$holiday_month
## Dickey-Fuller = -4.6219, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
adf.test(coca3$sales_unitboxes)# stationary (p-value < 0.05)
## Warning in adf.test(coca3$sales_unitboxes): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: coca3$sales_unitboxes
## Dickey-Fuller = -4.4282, Lag order = 3, p-value = 0.01
## alternative hypothesis: stationary
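Most of the series tested above fail to reject the unit-root null at the 5% level. A common remedy is to take first differences before modelling. The sketch below illustrates the effect on a simulated random walk; it is a generic example and does not use the data set, and a differenced series should still be re-tested.

```r
set.seed(11)
level      <- cumsum(rnorm(200))  # simulated random walk: non-stationary in levels
first_diff <- diff(level)         # first difference: one observation is lost

# Lag-1 autocorrelation as a quick persistence check
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

print(c(level = lag1(level), differenced = lag1(first_diff)))
print(length(first_diff))
```

The level series is highly persistent while the differenced one is close to uncorrelated, which is the behavior differencing is meant to produce.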
var_tseries1<-cbind(sales_unitboxes,max_temperature,itaee_growth,pop_minwage,consumer_sentiment)
colnames(var_tseries1)<-c("sales_unitboxes","max_temperature","itaee_growth","pop_minwage","consumer_sentiment")
lagselect1<-VARselect(var_tseries1,lag.max=5,type="const")
lagselect1$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 5 1 1 1
lagselect1$criteria
## 1 2 3 4 5
## AIC(n) 1.731302e+01 1.766279e+01 1.776954e+01 1.747702e+01 1.681127e+01
## HQ(n) 1.776614e+01 1.849351e+01 1.897787e+01 1.906295e+01 1.877480e+01
## SC(n) 1.854176e+01 1.991548e+01 2.104619e+01 2.177762e+01 2.213583e+01
## FPE(n) 3.333708e+07 4.966878e+07 6.290342e+07 6.137404e+07 5.212851e+07
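Each criterion in the table above selects the lag with the smallest value in its row. The sketch below re-enters the printed criteria by hand and reproduces the selection: AIC prefers 5 lags, while HQ, SC, and FPE all prefer 1.

```r
# Criteria values copied from the VARselect() output above
crit <- rbind(
  "AIC(n)" = c(17.31302, 17.66279, 17.76954, 17.47702, 16.81127),
  "HQ(n)"  = c(17.76614, 18.49351, 18.97787, 19.06295, 18.77480),
  "SC(n)"  = c(18.54176, 19.91548, 21.04619, 21.77762, 22.13583),
  "FPE(n)" = c(3.333708e7, 4.966878e7, 6.290342e7, 6.137404e7, 5.212851e7)
)
colnames(crit) <- 1:5

# Each criterion selects the lag with the smallest value in its row
chosen <- apply(crit, 1, which.min)
print(chosen)
```

With five variables, a VAR with 5 lags would need 26 coefficients per equation, which is a lot for 48 monthly observations; that may be one reason to follow the three criteria that agree on a single lag.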
var_model1<-VAR(var_tseries1,p=1,type="const",season=NULL,exogen=NULL)
summary(var_model1)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: sales_unitboxes, max_temperature, itaee_growth, pop_minwage, consumer_sentiment
## Deterministic variables: const
## Sample size: 47
## Log Likelihood: -707.645
## Roots of the characteristic polynomial:
## 0.8681 0.8681 0.6131 0.485 0.485
## Call:
## VAR(y = var_tseries1, p = 1, type = "const", exogen = NULL)
##
##
## Estimation results for equation sales_unitboxes:
## ================================================
## sales_unitboxes = sales_unitboxes.l1 + max_temperature.l1 + itaee_growth.l1 + pop_minwage.l1 + consumer_sentiment.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unitboxes.l1 5.191e-02 1.422e-01 0.365 0.7170
## max_temperature.l1 1.504e+05 3.146e+04 4.781 2.26e-05 ***
## itaee_growth.l1 1.293e+06 4.426e+06 0.292 0.7717
## pop_minwage.l1 1.271e+05 6.494e+04 1.958 0.0571 .
## consumer_sentiment.l1 4.265e+04 2.515e+04 1.696 0.0975 .
## const -1.475e+06 1.432e+06 -1.030 0.3089
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 407000 on 41 degrees of freedom
## Multiple R-Squared: 0.5731, Adjusted R-squared: 0.521
## F-statistic: 11.01 on 5 and 41 DF, p-value: 9.201e-07
##
##
## Estimation results for equation max_temperature:
## ================================================
## max_temperature = sales_unitboxes.l1 + max_temperature.l1 + itaee_growth.l1 + pop_minwage.l1 + consumer_sentiment.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unitboxes.l1 -7.331e-07 6.795e-07 -1.079 0.2869
## max_temperature.l1 7.179e-01 1.503e-01 4.776 2.3e-05 ***
## itaee_growth.l1 2.455e+01 2.114e+01 1.161 0.2522
## pop_minwage.l1 3.730e-01 3.103e-01 1.202 0.2361
## consumer_sentiment.l1 -1.601e-01 1.201e-01 -1.333 0.1900
## const 1.433e+01 6.840e+00 2.095 0.0424 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 1.945 on 41 degrees of freedom
## Multiple R-Squared: 0.5149, Adjusted R-squared: 0.4557
## F-statistic: 8.703 on 5 and 41 DF, p-value: 1.092e-05
##
##
## Estimation results for equation itaee_growth:
## =============================================
## itaee_growth = sales_unitboxes.l1 + max_temperature.l1 + itaee_growth.l1 + pop_minwage.l1 + consumer_sentiment.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unitboxes.l1 1.553e-09 4.304e-09 0.361 0.720145
## max_temperature.l1 -2.605e-04 9.522e-04 -0.274 0.785779
## itaee_growth.l1 5.609e-01 1.339e-01 4.188 0.000146 ***
## pop_minwage.l1 -1.290e-03 1.965e-03 -0.656 0.515397
## consumer_sentiment.l1 -3.534e-04 7.611e-04 -0.464 0.644815
## const 3.876e-02 4.333e-02 0.894 0.376306
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.01232 on 41 degrees of freedom
## Multiple R-Squared: 0.3597, Adjusted R-squared: 0.2816
## F-statistic: 4.606 on 5 and 41 DF, p-value: 0.001997
##
##
## Estimation results for equation pop_minwage:
## ============================================
## pop_minwage = sales_unitboxes.l1 + max_temperature.l1 + itaee_growth.l1 + pop_minwage.l1 + consumer_sentiment.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unitboxes.l1 2.832e-07 1.084e-07 2.611 0.01254 *
## max_temperature.l1 -8.494e-02 2.399e-02 -3.541 0.00101 **
## itaee_growth.l1 -1.338e+00 3.374e+00 -0.397 0.69376
## pop_minwage.l1 8.964e-01 4.951e-02 18.104 < 2e-16 ***
## consumer_sentiment.l1 -4.429e-02 1.917e-02 -2.310 0.02601 *
## const 3.640e+00 1.092e+00 3.335 0.00182 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.3103 on 41 degrees of freedom
## Multiple R-Squared: 0.9127, Adjusted R-squared: 0.902
## F-statistic: 85.68 on 5 and 41 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation consumer_sentiment:
## ===================================================
## consumer_sentiment = sales_unitboxes.l1 + max_temperature.l1 + itaee_growth.l1 + pop_minwage.l1 + consumer_sentiment.l1 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unitboxes.l1 -1.106e-06 5.480e-07 -2.017 0.05024 .
## max_temperature.l1 3.367e-01 1.212e-01 2.777 0.00823 **
## itaee_growth.l1 -4.648e+00 1.705e+01 -0.273 0.78654
## pop_minwage.l1 4.345e-01 2.502e-01 1.736 0.09001 .
## consumer_sentiment.l1 1.004e+00 9.689e-02 10.365 5.07e-13 ***
## const -7.836e+00 5.516e+00 -1.420 0.16304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 1.568 on 41 degrees of freedom
## Multiple R-Squared: 0.7424, Adjusted R-squared: 0.711
## F-statistic: 23.63 on 5 and 41 DF, p-value: 4.202e-11
##
##
##
## Covariance matrix of residuals:
## sales_unitboxes max_temperature itaee_growth pop_minwage
## sales_unitboxes 1.657e+11 2.527e+05 -1.115e+03 1.010e+04
## max_temperature 2.527e+05 3.782e+00 -1.919e-03 4.525e-02
## itaee_growth -1.115e+03 -1.919e-03 1.518e-04 -7.279e-04
## pop_minwage 1.010e+04 4.525e-02 -7.279e-04 9.631e-02
## consumer_sentiment 5.019e+04 -6.989e-01 -2.138e-03 -7.704e-02
## consumer_sentiment
## sales_unitboxes 5.019e+04
## max_temperature -6.989e-01
## itaee_growth -2.138e-03
## pop_minwage -7.704e-02
## consumer_sentiment 2.460e+00
##
## Correlation matrix of residuals:
## sales_unitboxes max_temperature itaee_growth pop_minwage
## sales_unitboxes 1.00000 0.31925 -0.22240 0.07997
## max_temperature 0.31925 1.00000 -0.08012 0.07497
## itaee_growth -0.22240 -0.08012 1.00000 -0.19040
## pop_minwage 0.07997 0.07497 -0.19040 1.00000
## consumer_sentiment 0.07863 -0.22916 -0.11068 -0.15828
## consumer_sentiment
## sales_unitboxes 0.07863
## max_temperature -0.22916
## itaee_growth -0.11068
## pop_minwage -0.15828
## consumer_sentiment 1.00000
Granger causality test: does sales_unitboxes Granger-cause the other variables in the system?
granger_coca<-causality(var_model1,cause="sales_unitboxes")
granger_coca
## $Granger
##
## Granger causality H0: sales_unitboxes do not Granger-cause
## max_temperature itaee_growth pop_minwage consumer_sentiment
##
## data: VAR object var_model1
## F-Test = 3.1251, df1 = 4, df2 = 205, p-value = 0.01597
##
##
## $Instant
##
## H0: No instantaneous causality between: sales_unitboxes and
## max_temperature itaee_growth pop_minwage consumer_sentiment
##
## data: VAR object var_model1
## Chi-squared = 6.4797, df = 4, p-value = 0.1661
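The call above tests one cause variable at a time. To test each variable in turn against all the others, the same causality() call can be repeated once per column. The sketch below is a minimal illustration on simulated data: the series a, b, c_series and the object names sim, fit and granger_p are hypothetical and are not the Coca-Cola data. With the fitted model from this report, the loop would run over the variable names of var_model1 instead.

```r
library(vars)

set.seed(123)
n <- 60

# Hypothetical simulated series: b depends on the previous value of a
a <- rnorm(n)
b <- c(0, 0.6 * a[-n]) + rnorm(n)
c_series <- rnorm(n)
sim <- data.frame(a = a, b = b, c_series = c_series)

# Fit a small VAR(1) with a constant, as in the report
fit <- VAR(sim, p = 1, type = "const")

# One Granger test per candidate cause; keep only the p-values
granger_p <- sapply(colnames(sim), function(v) {
  as.numeric(causality(fit, cause = v)$Granger$p.value)
})
granger_p
```

A small p-value for a given variable is evidence that its lags help predict the remaining variables jointly; it does not say which of them is affected.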
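Transform non-stationary time series variables by taking the first difference of their logarithms. In the lag selection on the transformed data, AIC and FPE are minimized at 5 lags and HQ and SC at 1 lag; we use 2 lags because we consider 5 too many to analyze.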
diff_sales_unitboxes <- diff(log(sales_unitboxes))
diff_itaee_growth<-diff(log(itaee_growth))
diff_unemp_rate<-diff(log(unemp_rate))
diff_consumer_sentiment<-diff(log(consumer_sentiment))
diff_max_temperature <- diff(log(max_temperature))
var_tseries2<-cbind(diff_sales_unitboxes, diff_itaee_growth,diff_unemp_rate,diff_consumer_sentiment,diff_max_temperature)
colnames(var_tseries2)<-c("sales_unit_boxes", "itaee_growth","unemp_rate","consumer_sentiment","max_temperature")
lagselect2<-VARselect(var_tseries2,lag.max=5,type="const")
lagselect2$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 5 1 1 5
lagselect2$criteria
## 1 2 3 4 5
## AIC(n) -2.195882e+01 -2.147906e+01 -2.182207e+01 -2.182334e+01 -2.310243e+01
## HQ(n) -2.150388e+01 -2.064499e+01 -2.060888e+01 -2.023103e+01 -2.113100e+01
## SC(n) -2.071763e+01 -1.920354e+01 -1.851223e+01 -1.747917e+01 -1.772393e+01
## FPE(n) 2.935464e-10 4.999133e-10 4.079544e-10 5.449744e-10 2.631452e-10
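As a check on how the selection line is obtained from the criteria table, the sketch below copies the printed values into a matrix and picks the lag with the smallest value in each row. The object names criteria and best_lag are introduced here for illustration only.

```r
# Criteria values copied from the lagselect2$criteria output above
criteria <- rbind(
  AIC = c(-21.95882, -21.47906, -21.82207, -21.82334, -23.10243),
  HQ  = c(-21.50388, -20.64499, -20.60888, -20.23103, -21.13100),
  SC  = c(-20.71763, -19.20354, -18.51223, -17.47917, -17.72393),
  FPE = c(2.935464e-10, 4.999133e-10, 4.079544e-10, 5.449744e-10, 2.631452e-10)
)
colnames(criteria) <- 1:5

# For every criterion, the preferred lag is the column with the minimum value
best_lag <- apply(criteria, 1, which.min)
best_lag
```

This reproduces the selection of 5 lags by AIC and FPE and 1 lag by HQ and SC. A lag of 2 is not the minimum of any criterion, so using p = 2 is a judgment call rather than a result of the selection procedure.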
Specify model
var_model2<-VAR(var_tseries2,p=2,type="const",season=NULL,exog=NULL)
summary(var_model2)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: sales_unit_boxes, itaee_growth, unemp_rate, consumer_sentiment, max_temperature
## Deterministic variables: const
## Sample size: 45
## Log Likelihood: 226.356
## Roots of the characteristic polynomial:
## 0.6449 0.6449 0.6279 0.6279 0.5687 0.5687 0.4648 0.4648 0.3648 0.3648
## Call:
## VAR(y = var_tseries2, p = 2, type = "const", exogen = NULL)
##
##
## Estimation results for equation sales_unit_boxes:
## =================================================
## sales_unit_boxes = sales_unit_boxes.l1 + itaee_growth.l1 + unemp_rate.l1 + consumer_sentiment.l1 + max_temperature.l1 + sales_unit_boxes.l2 + itaee_growth.l2 + unemp_rate.l2 + consumer_sentiment.l2 + max_temperature.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unit_boxes.l1 -0.563053 0.177327 -3.175 0.00318 **
## itaee_growth.l1 0.002355 0.020010 0.118 0.90701
## unemp_rate.l1 0.040559 0.149473 0.271 0.78777
## consumer_sentiment.l1 0.678038 0.279016 2.430 0.02052 *
## max_temperature.l1 0.675570 0.200245 3.374 0.00187 **
## sales_unit_boxes.l2 -0.195086 0.165309 -1.180 0.24614
## itaee_growth.l2 -0.002210 0.020537 -0.108 0.91492
## unemp_rate.l2 -0.182797 0.143861 -1.271 0.21248
## consumer_sentiment.l2 -0.190435 0.302412 -0.630 0.53309
## max_temperature.l2 0.443002 0.208280 2.127 0.04076 *
## const 0.003901 0.011480 0.340 0.73607
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.07556 on 34 degrees of freedom
## Multiple R-Squared: 0.4546, Adjusted R-squared: 0.2942
## F-statistic: 2.834 on 10 and 34 DF, p-value: 0.01132
##
##
## Estimation results for equation itaee_growth:
## =============================================
## itaee_growth = sales_unit_boxes.l1 + itaee_growth.l1 + unemp_rate.l1 + consumer_sentiment.l1 + max_temperature.l1 + sales_unit_boxes.l2 + itaee_growth.l2 + unemp_rate.l2 + consumer_sentiment.l2 + max_temperature.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unit_boxes.l1 1.692690 1.529816 1.106 0.276
## itaee_growth.l1 -0.006317 0.172632 -0.037 0.971
## unemp_rate.l1 0.849594 1.289514 0.659 0.514
## consumer_sentiment.l1 -2.576313 2.407091 -1.070 0.292
## max_temperature.l1 -2.075133 1.727528 -1.201 0.238
## sales_unit_boxes.l2 0.615984 1.426133 0.432 0.669
## itaee_growth.l2 -0.036686 0.177173 -0.207 0.837
## unemp_rate.l2 0.154218 1.241101 0.124 0.902
## consumer_sentiment.l2 -3.120630 2.608926 -1.196 0.240
## max_temperature.l2 -0.545550 1.796842 -0.304 0.763
## const -0.008557 0.099042 -0.086 0.932
##
##
## Residual standard error: 0.6519 on 34 degrees of freedom
## Multiple R-Squared: 0.08297, Adjusted R-squared: -0.1867
## F-statistic: 0.3076 on 10 and 34 DF, p-value: 0.974
##
##
## Estimation results for equation unemp_rate:
## ===========================================
## unemp_rate = sales_unit_boxes.l1 + itaee_growth.l1 + unemp_rate.l1 + consumer_sentiment.l1 + max_temperature.l1 + sales_unit_boxes.l2 + itaee_growth.l2 + unemp_rate.l2 + consumer_sentiment.l2 + max_temperature.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unit_boxes.l1 0.268492 0.189143 1.420 0.16485
## itaee_growth.l1 0.031321 0.021344 1.467 0.15145
## unemp_rate.l1 -0.514789 0.159433 -3.229 0.00275 **
## consumer_sentiment.l1 0.605996 0.297607 2.036 0.04958 *
## max_temperature.l1 -0.046973 0.213588 -0.220 0.82725
## sales_unit_boxes.l2 0.318124 0.176324 1.804 0.08007 .
## itaee_growth.l2 0.006572 0.021905 0.300 0.76600
## unemp_rate.l2 -0.365214 0.153447 -2.380 0.02306 *
## consumer_sentiment.l2 -0.045120 0.322562 -0.140 0.88958
## max_temperature.l2 -0.347949 0.222157 -1.566 0.12656
## const -0.011659 0.012245 -0.952 0.34776
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.08059 on 34 degrees of freedom
## Multiple R-Squared: 0.3981, Adjusted R-squared: 0.221
## F-statistic: 2.249 on 10 and 34 DF, p-value: 0.03839
##
##
## Estimation results for equation consumer_sentiment:
## ===================================================
## consumer_sentiment = sales_unit_boxes.l1 + itaee_growth.l1 + unemp_rate.l1 + consumer_sentiment.l1 + max_temperature.l1 + sales_unit_boxes.l2 + itaee_growth.l2 + unemp_rate.l2 + consumer_sentiment.l2 + max_temperature.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unit_boxes.l1 -0.2018097 0.1122488 -1.798 0.0811 .
## itaee_growth.l1 -0.0101777 0.0126667 -0.803 0.4273
## unemp_rate.l1 0.1005459 0.0946168 1.063 0.2954
## consumer_sentiment.l1 -0.0126064 0.1766179 -0.071 0.9435
## max_temperature.l1 0.1862236 0.1267557 1.469 0.1510
## sales_unit_boxes.l2 -0.0026797 0.1046411 -0.026 0.9797
## itaee_growth.l2 0.0004688 0.0129999 0.036 0.9714
## unemp_rate.l2 -0.0909286 0.0910646 -0.999 0.3251
## consumer_sentiment.l2 0.0090747 0.1914274 0.047 0.9625
## max_temperature.l2 0.2573765 0.1318415 1.952 0.0592 .
## const 0.0042831 0.0072671 0.589 0.5595
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.04783 on 34 degrees of freedom
## Multiple R-Squared: 0.2566, Adjusted R-squared: 0.03792
## F-statistic: 1.173 on 10 and 34 DF, p-value: 0.3417
##
##
## Estimation results for equation max_temperature:
## ================================================
## max_temperature = sales_unit_boxes.l1 + itaee_growth.l1 + unemp_rate.l1 + consumer_sentiment.l1 + max_temperature.l1 + sales_unit_boxes.l2 + itaee_growth.l2 + unemp_rate.l2 + consumer_sentiment.l2 + max_temperature.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## sales_unit_boxes.l1 0.085163 0.170127 0.501 0.620
## itaee_growth.l1 0.011969 0.019198 0.623 0.537
## unemp_rate.l1 0.098539 0.143404 0.687 0.497
## consumer_sentiment.l1 -0.064109 0.267687 -0.239 0.812
## max_temperature.l1 0.137291 0.192114 0.715 0.480
## sales_unit_boxes.l2 -0.012406 0.158597 -0.078 0.938
## itaee_growth.l2 0.014476 0.019703 0.735 0.468
## unemp_rate.l2 -0.109129 0.138020 -0.791 0.435
## consumer_sentiment.l2 -0.264774 0.290132 -0.913 0.368
## max_temperature.l2 -0.159065 0.199822 -0.796 0.432
## const -0.001181 0.011014 -0.107 0.915
##
##
## Residual standard error: 0.07249 on 34 degrees of freedom
## Multiple R-Squared: 0.1656, Adjusted R-squared: -0.07978
## F-statistic: 0.6749 on 10 and 34 DF, p-value: 0.7395
##
##
##
## Covariance matrix of residuals:
## sales_unit_boxes itaee_growth unemp_rate consumer_sentiment
## sales_unit_boxes 0.005709 -0.013349 0.0016036 0.0004540
## itaee_growth -0.013349 0.424914 -0.0034855 -0.0024233
## unemp_rate 0.001604 -0.003485 0.0064954 0.0004427
## consumer_sentiment 0.000454 -0.002423 0.0004427 0.0022876
## max_temperature 0.002068 -0.008945 0.0009411 -0.0008576
## max_temperature
## sales_unit_boxes 0.0020682
## itaee_growth -0.0089447
## unemp_rate 0.0009411
## consumer_sentiment -0.0008576
## max_temperature 0.0052550
##
## Correlation matrix of residuals:
## sales_unit_boxes itaee_growth unemp_rate consumer_sentiment
## sales_unit_boxes 1.0000 -0.27103 0.26334 0.12563
## itaee_growth -0.2710 1.00000 -0.06635 -0.07772
## unemp_rate 0.2633 -0.06635 1.00000 0.11484
## consumer_sentiment 0.1256 -0.07772 0.11484 1.00000
## max_temperature 0.3776 -0.18929 0.16109 -0.24734
## max_temperature
## sales_unit_boxes 0.3776
## itaee_growth -0.1893
## unemp_rate 0.1611
## consumer_sentiment -0.2473
## max_temperature 1.0000
granger_coca1<-causality(var_model2,cause="sales_unit_boxes")
granger_coca1
## $Granger
##
## Granger causality H0: sales_unit_boxes do not Granger-cause
## itaee_growth unemp_rate consumer_sentiment max_temperature
##
## data: VAR object var_model2
## F-Test = 1.1727, df1 = 8, df2 = 170, p-value = 0.3183
##
##
## $Instant
##
## H0: No instantaneous causality between: sales_unit_boxes and
## itaee_growth unemp_rate consumer_sentiment max_temperature
##
## data: VAR object var_model2
## Chi-squared = 9.0746, df = 4, p-value = 0.05926
For the selection of the best model we will take into consideration the Akaike information criterion (AIC) and some model diagnostics such as the Ljung-Box test, R squared, and the number of statistically significant variables, but only where the diagnostic applies to the model.
First we compare the Ljung-Box test for the ARMA and ARIMA models. ARMA model: p-value 0.9991. ARIMA model: p-value 0.8713. For both models we fail to reject the null hypothesis of no residual autocorrelation, since the p-value is > 0.05, so neither model shows lack of fit.
Box.test(ARMA.residuals,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: ARMA.residuals
## X-squared = 1.2684e-06, df = 1, p-value = 0.9991
Box.test(ARIMA.mydatar$residuals,lag=1,type="Ljung-Box")
##
## Box-Ljung test
##
## data: ARIMA.mydatar$residuals
## X-squared = 0.026233, df = 1, p-value = 0.8713
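To make the decision rule explicit, the sketch below recomputes the two p-values from the reported X-squared statistics (chi-squared distribution with 1 degree of freedom) and compares them with the 5% level. The names x_squared, p_values, alpha and decision are introduced here for illustration.

```r
# Test statistics copied from the two Box-Ljung outputs above
x_squared <- c(ARMA = 1.2684e-06, ARIMA = 0.026233)

# Upper-tail probability of a chi-squared variable with df = 1
p_values <- pchisq(x_squared, df = 1, lower.tail = FALSE)
round(p_values, 4)

# Reject H0 (no residual autocorrelation) only when p < alpha
alpha <- 0.05
decision <- ifelse(p_values < alpha,
                   "reject H0: residuals are autocorrelated",
                   "fail to reject H0: no evidence of residual autocorrelation")
decision
```

Both p-values are far above 0.05, so the null hypothesis is not rejected for either model.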
Here we evaluate the Akaike information criterion (AIC) for each of our models: ARMA, ARIMA, and VAR (with and without logarithm):
Model 1 ARMA(1,1): 1400.62
Model 2 ARIMA (1,1,1) : -97.31
Model 3 VAR(no log) : 1.73e+01
Model 3.1 VAR(log): -2.14e+01
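Between the two VAR specifications, Model 3.1, which uses the logarithmic transformation, has the lower AIC. The ARIMA(1,1,1) value is numerically lower still, but the VAR values come from VARselect, which reports the criterion on a different scale than a single-equation ARMA/ARIMA fit, and the models use differently transformed data, so the figures are not directly comparable across model families. To choose the VAR model that fits best we will also take the R squared into consideration.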
R SQUARED
MODEL 3 VAR: 29.4%
MODEL 3.1 VAR: 52.1%
Taking this into consideration, and given that both models have statistically significant variables that affect sales over time, we choose Model 3.1 VAR (with log) as the best model. Its R squared is larger, which means that a larger share of the variance of the dependent variable is explained by the regressors in the model.
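Because R squared always rises when regressors are added, the summary output also reports an adjusted R squared. As a worked check, the sketch below applies the standard adjustment formula to the figures printed for the sales_unit_boxes equation of var_model2 (R squared 0.4546, sample size 45, 10 lagged regressors plus a constant). The variable names are introduced here for illustration.

```r
r_squared <- 0.4546  # Multiple R-Squared, sales_unit_boxes equation of var_model2
n_obs <- 45          # sample size reported by summary(var_model2)
k <- 10              # lagged regressors (5 variables x 2 lags), excluding const

# Adjusted R squared penalizes the number of regressors;
# n_obs - k - 1 = 34 is the residual degrees of freedom in the output
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k - 1)
round(adj_r_squared, 4)
```

The result agrees with the adjusted R squared of 0.2942 printed in the summary, which shows how much of the raw R squared is attributable to the number of lagged terms rather than to explanatory power.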
Here we have our first plots, one for each of the independent and dependent variables, showing its behavior over the period 2015-2018. The variables where we can observe trends and the data are non-stationary are:
and the variables where we see a constant mean over time and stationary time series data are:
The most compelling plots from our variables are:
par(mfrow=c(3,3))
plot(coca3$date,coca3$consumer_sentiment,type="l",col="blue",lwd=2,xlab="Date",ylab="consumer_sentiment",main="consumer_sentiment")
plot(coca3$date,coca3$CPI,type="l",col="blue",lwd=2,xlab="Date",ylab="CPI",main="CPI Rate")
plot(coca3$date,coca3$inflation_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="Inflation",main="Inflation Rate")
plot(coca3$date,coca3$unemp_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="unemp_rate",main="unemp_rate")
plot(coca3$date,coca3$gdp_percapita,type="l",col="blue",lwd=2,xlab="Date",ylab="gdp_percapita",main="gdp_percapita")
plot(coca3$date,coca3$itaee,type="l",col="blue",lwd=2,xlab="Date",ylab="itaee",main="itaee")
plot(coca3$date,coca3$gdp_percapita,type="l",col="blue",lwd=2,xlab="Date",ylab="gdp_percapita",main="gdp_percapita")
plot(coca3$date,coca3$itaee_growth,type="l",col="blue",lwd=2,xlab="Date",ylab="itaee_growth",main="itaee_growth")
plot(coca3$date,coca3$pop_density,type="l",col="blue",lwd=2,xlab="Date",ylab="pop_density",main="pop_density")
plot(coca3$date,coca3$job_density,type="l",col="blue",lwd=2,xlab="Date",ylab="job_density",main="job_density")
plot(coca3$date,coca3$pop_minwage,type="l",col="blue",lwd=2,xlab="Date",ylab="pop_minwage",main="pop_minwage")
plot(coca3$date,coca3$exchange_rate,type="l",col="blue",lwd=2,xlab="Date",ylab="exchange_rate",main="exchange_rate")
plot(coca3$date,coca3$max_temperature,type="l",col="blue",lwd=2,xlab="Date",ylab="max_temperature",main="max_temperature")
plot(coca3$date,coca3$holiday_month,type="l",col="blue",lwd=2,xlab="Date",ylab="holiday_month",main="holiday_month")
plot(coca3$date,coca3$sales_unitboxes,type="l",col="blue",lwd=2,xlab="Date",ylab="sales_unitboxes",main="sales_unitboxes")
Here we have a more specific graph of the behavior of sales from 2015 to 2018. We can clearly observe a seasonal pattern in this series: most of the troughs in unit box sales occur around the beginning of the year, while the highest peaks occur around the middle of the year.
ts_plot(sales_unitboxes)
It is important to assess whether the variables under study are stationary. As we mentioned earlier, only 4 of our variables are stationary, while the other 10 are non-stationary.
For our model we chose 5 variables:
- sales_unitboxes
- max_temperature
- itaee_growth
- pop_minwage
- consumer_sentiment
We chose these variables because of the patterns we saw in our earlier plots, since they were the most compelling at first sight. Then we ran an ADF test (adf.test) to assess which series were stationary and which were not. To make all our variables stationary, we took the first difference of the logarithm of each series, so that the statistical properties of the system do not change over time. The information criteria disagree on the lag order (AIC and FPE select 5 lags, HQ and SC select 1); we use 2 lags for our model, since we consider 5 lags too many to analyze.
lagselect2<-VARselect(var_tseries2,lag.max=5,type="const")
lagselect2$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 5 1 1 5
lagselect2$criteria
## 1 2 3 4 5
## AIC(n) -2.195882e+01 -2.147906e+01 -2.182207e+01 -2.182334e+01 -2.310243e+01
## HQ(n) -2.150388e+01 -2.064499e+01 -2.060888e+01 -2.023103e+01 -2.113100e+01
## SC(n) -2.071763e+01 -1.920354e+01 -1.851223e+01 -1.747917e+01 -1.772393e+01
## FPE(n) 2.935464e-10 4.999133e-10 4.079544e-10 5.449744e-10 2.631452e-10
var_model2<-VAR(var_tseries2,p=2,type="const",season=NULL,exog=NULL)
granger_coca1<-causality(var_model2,cause="sales_unit_boxes")
granger_coca1
## $Granger
##
## Granger causality H0: sales_unit_boxes do not Granger-cause
## itaee_growth unemp_rate consumer_sentiment max_temperature
##
## data: VAR object var_model2
## F-Test = 1.1727, df1 = 8, df2 = 170, p-value = 0.3183
##
##
## $Instant
##
## H0: No instantaneous causality between: sales_unit_boxes and
## itaee_growth unemp_rate consumer_sentiment max_temperature
##
## data: VAR object var_model2
## Chi-squared = 9.0746, df = 4, p-value = 0.05926
Finally, for our results analysis we forecast the next twelve months of unit box sales from our vector autoregression model. Because the model was fitted on log differences, the forecast values are approximate month-over-month growth rates rather than sales levels. In the grey area of the chart we see a pattern where sales growth fluctuates during the first months and stabilizes by the end of the year.
From the forecast table, we expect the largest increase in sales in March and the largest decrease in February.
Most of the negative impacts on sales fall around the beginning and the middle of the year.
Unit box sales tend to stabilize by the end of the year; 5 of the 12 months have a negative expected change in sales.
forecast<-predict(var_model2,n.ahead=12,ci=0.95) ### forecast for the next year
fanchart(forecast,names="sales_unit_boxes",main="Sales unit boxes",xlab="Time Period",ylab="sales")
forecast
## $sales_unit_boxes
## fcst lower upper CI
## [1,] -0.0254134123 -0.1735067 0.1226798 0.1480933
## [2,] -0.0698473457 -0.2481691 0.1084744 0.1783217
## [3,] 0.0263030809 -0.1686402 0.2212463 0.1949433
## [4,] 0.0081452678 -0.1908338 0.2071243 0.1989791
## [5,] -0.0062007142 -0.2069735 0.1945721 0.2007728
## [6,] 0.0136862304 -0.1874639 0.2148363 0.2011501
## [7,] -0.0014109828 -0.2028281 0.2000061 0.2014171
## [8,] -0.0001866577 -0.2017088 0.2013355 0.2015222
## [9,] 0.0038125135 -0.1977626 0.2053876 0.2015751
## [10,] 0.0003678387 -0.2012161 0.2019518 0.2015839
## [11,] 0.0020811909 -0.1995076 0.2036699 0.2015887
## [12,] 0.0024052560 -0.1991866 0.2039971 0.2015919
##
## $itaee_growth
## fcst lower upper CI
## [1,] 0.190528304 -1.087083 1.468139 1.277611
## [2,] -0.154844316 -1.477634 1.167945 1.322790
## [3,] -0.006394354 -1.339138 1.326349 1.332744
## [4,] -0.007579971 -1.341272 1.326113 1.333693
## [5,] -0.004343960 -1.338950 1.330262 1.334606
## [6,] -0.006528384 -1.341883 1.328826 1.335354
## [7,] -0.037560637 -1.373002 1.297881 1.335441
## [8,] -0.013161167 -1.348867 1.322545 1.335706
## [9,] -0.016633990 -1.352372 1.319104 1.335738
## [10,] -0.022969441 -1.358738 1.312800 1.335769
## [11,] -0.016838989 -1.352618 1.318940 1.335779
## [12,] -0.017657425 -1.353438 1.318123 1.335780
##
## $unemp_rate
## fcst lower upper CI
## [1,] -0.0141177180 -0.1720786 0.1438432 0.1579609
## [2,] -0.0054375867 -0.1933223 0.1824471 0.1878847
## [3,] -0.0448636810 -0.2352509 0.1455236 0.1903873
## [4,] 0.0153818374 -0.1829245 0.2136881 0.1983063
## [5,] 0.0002906435 -0.1993930 0.1999743 0.1996836
## [6,] -0.0120620725 -0.2131676 0.1890434 0.2011055
## [7,] -0.0020407195 -0.2034432 0.1993618 0.2014025
## [8,] -0.0025407442 -0.2040421 0.1989606 0.2015013
## [9,] -0.0063143076 -0.2078630 0.1952344 0.2015487
## [10,] -0.0046910778 -0.2062508 0.1968686 0.2015597
## [11,] -0.0043937429 -0.2059582 0.1971707 0.2015644
## [12,] -0.0045825815 -0.2061480 0.1969829 0.2015654
##
## $consumer_sentiment
## fcst lower upper CI
## [1,] -0.017940867 -0.11168433 0.07580259 0.09374346
## [2,] -0.020359756 -0.12064282 0.07992330 0.10028306
## [3,] 0.017384947 -0.08971254 0.12448243 0.10709748
## [4,] -0.010596425 -0.11855438 0.09736153 0.10795796
## [5,] 0.008701144 -0.09987355 0.11727584 0.10857470
## [6,] 0.005137413 -0.10386716 0.11414199 0.10900457
## [7,] 0.000641724 -0.10851596 0.10979941 0.10915769
## [8,] 0.004606393 -0.10458598 0.11379877 0.10919237
## [9,] 0.002666408 -0.10653915 0.11187197 0.10920556
## [10,] 0.002216835 -0.10699598 0.11142964 0.10921281
## [11,] 0.003474322 -0.10574306 0.11269170 0.10921738
## [12,] 0.002807770 -0.10641056 0.11202610 0.10921833
##
## $max_temperature
## fcst lower upper CI
## [1,] 0.0003296785 -0.1417503 0.1424096 0.1420799
## [2,] -0.0195629344 -0.1658784 0.1267525 0.1463154
## [3,] -0.0015873278 -0.1547730 0.1515983 0.1531857
## [4,] 0.0029501463 -0.1516793 0.1575796 0.1546294
## [5,] 0.0021484291 -0.1527186 0.1570155 0.1548670
## [6,] -0.0015482753 -0.1565762 0.1534797 0.1550279
## [7,] -0.0044873516 -0.1595464 0.1505717 0.1550590
## [8,] -0.0026710166 -0.1577834 0.1524413 0.1551124
## [9,] -0.0020264716 -0.1571475 0.1530946 0.1551210
## [10,] -0.0028325150 -0.1579596 0.1522945 0.1551271
## [11,] -0.0024005432 -0.1575290 0.1527279 0.1551285
## [12,] -0.0021521315 -0.1572810 0.1529768 0.1551289
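Since the model was fitted on diff(log(...)) series, each forecast value is an approximate month-over-month growth rate. The sketch below copies the sales_unit_boxes point forecasts from the output above, labels them by month, and accumulates them into a sales index relative to the last observed month. It assumes that forecast step 1 corresponds to January of the forecast year; the names fcst and sales_index are introduced here for illustration.

```r
# Point forecasts for sales_unit_boxes, copied from the forecast output above
fcst <- c(-0.0254134123, -0.0698473457, 0.0263030809, 0.0081452678,
          -0.0062007142, 0.0136862304, -0.0014109828, -0.0001866577,
          0.0038125135, 0.0003678387, 0.0020811909, 0.0024052560)
names(fcst) <- month.abb  # assumption: step 1 is January of the forecast year

# Months with a negative expected change in sales
names(fcst)[fcst < 0]

# Months with the largest expected increase and decrease
names(which.max(fcst))
names(which.min(fcst))

# Cumulative log change converted to an index (last observed month = 1)
sales_index <- exp(cumsum(fcst))
round(sales_index, 3)
```

Multiplying sales_index by the last observed value of sales_unitboxes would give point forecasts in unit boxes; the index form is used here so that no figure outside the report has to be assumed.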
So far, the key insights from our analysis and forecasting are:
The variables maximum temperature and consumer sentiment seem to have an impact on our dependent variable, sales of unit boxes; however, consumer sentiment has an impact only in a certain period (lag 1).
We can identify a pattern of variance in sales at the beginning of the year, where we have the biggest positive impact (March) as well as the biggest negative impact (February) on sales; by the end of the year sales seem to stabilize.
Taking these observations together, we can conclude that consumer sentiment might be the variable with the greatest impact on sales in the first part of the year, as shown in our results for lag 1, while maximum temperature has an impact during most of the year. Consumer sentiment represents how consumers feel about their finances and the state of the economy. Our recommendation is therefore to take this variable into consideration in the months where we expect a negative impact on sales (January, February, May, July and August), and to offer special promotions and discounts in supermarkets and convenience stores during those months, so that consumers feel they do not have to spend much money on their favorite products.